Statistical analysis and modeling of mass spectrometry-based metabolomics data.
Abstract
Multivariate statistical techniques are used extensively in metabolomics studies, for tasks ranging from biomarker selection to model building and validation. This chapter discusses two model-independent variable selection techniques, principal component analysis and two-sample t-tests, as well as classification and regression models and model-related variable selection techniques, including partial least squares, logistic regression, support vector machines, and random forests. Model evaluation and validation methods, such as leave-one-out cross-validation, Monte Carlo cross-validation, and receiver operating characteristic analysis, are introduced with an emphasis on avoiding over-fitting the data. The advantages and limitations of these statistical techniques are also discussed.
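The abstract does not give an implementation, so the following is only a minimal sketch of how Monte Carlo cross-validation, two-sample t-test variable selection, and receiver operating characteristic analysis might fit together. It runs on synthetic data, and every function name (make_data, select_features, monte_carlo_cv, and so on) is hypothetical. The nearest-centroid classifier is a simple stand-in chosen to keep the example dependency-free; it is not one of the models the chapter covers. The point it illustrates is that variable selection is repeated inside each training split, so the held-out samples never influence which variables are chosen.

```python
import random
import statistics

def make_data(n=40, p=30, shift=1.5, seed=1):
    """Synthetic two-class data (assumption): only the first 3 variables differ."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(0, 1) + (shift if lab == 1 and j < 3 else 0.0)
          for j in range(p)] for lab in y]
    return X, y

def welch_t(a, b):
    """Two-sample (Welch) t statistic for two lists of values."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0

def select_features(X, y, k):
    """Rank variables by |t| between the classes and keep the top k indices."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 1]
        b = [row[j] for row, lab in zip(X, y) if lab == 0]
        scores.append((abs(welch_t(a, b)), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]

def centroid_scores(X_tr, y_tr, X_te, feats):
    """Stand-in classifier: higher score means closer to the class-1 centroid."""
    def centroid(lab):
        rows = [r for r, l in zip(X_tr, y_tr) if l == lab]
        return [statistics.mean(r[j] for r in rows) for j in feats]
    c0, c1 = centroid(0), centroid(1)
    def dist2(r, c):
        return sum((r[j] - cj) ** 2 for j, cj in zip(feats, c))
    return [dist2(r, c0) - dist2(r, c1) for r in X_te]

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def monte_carlo_cv(X, y, k=5, n_splits=50, test_frac=0.3, seed=0):
    """Repeated random train/test splits; selection uses training data only."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    aucs = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        n_te = int(len(idx) * test_frac)
        te, tr = idx[:n_te], idx[n_te:]
        y_tr, y_te = [y[i] for i in tr], [y[i] for i in te]
        # Skip degenerate splits: the test set needs both classes, and the
        # training set needs at least two samples per class for a variance.
        if len(set(y_te)) < 2 or min(y_tr.count(0), y_tr.count(1)) < 2:
            continue
        X_tr, X_te = [X[i] for i in tr], [X[i] for i in te]
        feats = select_features(X_tr, y_tr, k)  # no test samples involved
        aucs.append(auc(centroid_scores(X_tr, y_tr, X_te, feats), y_te))
    return statistics.mean(aucs)

X, y = make_data()
mean_auc = monte_carlo_cv(X, y)
print(f"Mean held-out AUC over random splits: {mean_auc:.3f}")
```

Selecting variables on the full data set before splitting would leak information from the held-out samples and inflate the estimated AUC, which is one common form of the over-fitting the abstract warns about.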
Similar resources
Galaxy-M: a Galaxy workflow for processing and analyzing direct infusion and liquid chromatography mass spectrometry-based metabolomics data.
BACKGROUND Metabolomics is increasingly recognized as an invaluable tool in the biological, medical and environmental sciences yet lags behind the methodological maturity of other omics fields. To achieve its full potential, including the integration of multiple omics modalities, the accessibility, standardization and reproducibility of computational metabolomics tools must be improved signific...
Mapping of Chemical and Biochemical Relationships of Mass Spectrometry-based Metabolomics Data
UC Davis, Davis, CA Introduction Mass spectrometry-based metabolomic studies generate quantitative data of the differential regulation of metabolites in response to genetic, environmental or physiological perturbations. Improved algorithms, larger mass spectral libraries and better instrumentations enable the identification of a larger number of metabolites in unbiased GC-MS or LC-MS runs. Howe...
A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data
Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest i...
Local false discovery rate estimation using feature reliability in LC/MS metabolomics data
False discovery rate (FDR) control is an important tool of statistical inference in feature selection. In mass spectrometry-based metabolomics data, features can be measured at different levels of reliability and false features are often detected in untargeted metabolite profiling as chemical and/or bioinformatics noise. The traditional false discovery rate methods treat all features equally, w...
MetDAT: a modular and workflow-based free online pipeline for mass spectrometry data processing, analysis and interpretation
SUMMARY Analysis of high throughput metabolomics experiments is a resource-intensive process that includes pre-processing, pre-treatment and post-processing at each level of experimental hierarchy. We developed an interactive user-friendly online software called Metabolite Data Analysis Tool (MetDAT) for mass spectrometry data. It offers a pipeline of tools for file handling, data pre-processin...
Journal title:
- Methods in molecular biology
Volume 1198
Publication date 2014